Exploring the Power of Heterogeneous
Abstract
The big data challenge is a unique opportunity for both data mining and database research and engineering. A vast ocean of data is collected from trillions of connected devices in real time on a daily basis, and useful knowledge is usually buried in data of multiple genres, from different sources, in different formats, and with different types of representation. Many interesting patterns cannot be extracted from a single data collection, but have to be discovered through the integrative analysis of all available heterogeneous data sources. Although many algorithms have been developed to analyze multiple information sources, real applications continuously pose new challenges: data can be gigantic, noisy, unreliable, dynamically evolving, highly imbalanced, and heterogeneous. Meanwhile, users provide limited feedback, have growing privacy concerns, and ask for actionable knowledge. In this thesis, we propose to explore the power of multiple heterogeneous information sources in such challenging learning scenarios. There are two interesting perspectives in learning from the correlations among multiple information sources: exploring their similarities (consensus combination) or their differences (inconsistency detection). In consensus combination, we focus on the task of classification with multiple information sources. Multiple information sources for the same set of objects can provide complementary predictive power, and by combining their expertise, the prediction accuracy is significantly improved. However, the major challenge is that it is hard to obtain sufficient and reliable labeled data for effective training, because labeling requires the efforts of experienced human annotators. In some data sources, we may only have a large amount of unlabeled data. Although such unlabeled information does not directly generate label predictions, it provides useful constraints on the classification task.
Therefore, we first propose a graph-based consensus maximization framework to combine multiple supervised and unsupervised models obtained from all the available information sources. We further demonstrate the benefits of combining multiple models in two specific learning scenarios. In ...
Similar resources
Jointly power and bandwidth allocation for a heterogeneous satellite network
Because resources such as transmission power and bandwidth are scarce in satellite systems, resource allocation is a very important challenge. Nowadays, new heterogeneous networks include one or more satellites alongside terrestrial infrastructure, and each satellite is assumed to have multiple beams to increase capacity. This type of structure is suitable for a new generation of commu...
Good, bad and ugly: Exploring the Machiavellian power dynamics of leadership in medical education
Introduction: Medical education requires participation of various stakeholders and this contributes to power dynamics operating at multiple levels. Personality traits of an individual can affect the smooth execution of the educational programmes and eventually the professionalism of the environment. With the increased focus on leadership traits in medical education and collaboration in health care se...
Exploring Dialogism and Multivocality in L2 Classroom-Discourse Architecture in Iran
Critical pedagogy (CP), as a poststructuralist educational movement, challenges the asymmetrical, power-over nature of classroom discourse and seeks to accommodate multivocality in the classroom and in the society. This study probed the discourse architecture of EFL classrooms in Iran. Specifically, it aimed to explore to what extent Iranian EFL classrooms have stepped away from the teacher-dom...
Mining Heterogeneous Information Networks by Exploring the Power of Links
Knowledge is power but for interrelated data, knowledge is often hidden in massive links in heterogeneous information networks. We explore the power of links at mining heterogeneous information networks with several interesting tasks, including link-based object distinction, veracity analysis, multidimensional online analytical processing of heterogeneous information networks, and rank-based cl...
Studying the Effect of Horizontal Drains on Stability of Heterogeneous and Homogeneous Earth Dams during Rapid Drawdown Condition
One of the main concerns in designing an earth dam is the stability of its upstream slope during rapid drawdown. Confined pore water pressure reduces the effective stress in this mode, so the possibility of instability and slippage increases. The main goal of this research is to investigate changes in the pore water pressure by using horizontal drains in upstream slope o...
Backhaul-Aware Decoupled Uplink and Downlink User Association, Subcarrier Allocation, and Power Control in FiWi HetNets
Decoupling the uplink and downlink user association improves the throughput of heterogeneous networks (HetNets) and balances the traffic load of macro- and small- base stations. Recently, fiber-wireless HetNets (FiWi-HetNets) have been considered as viable solutions for access networks. To improve the accuracy of user association and resource allocation algorithms in FiWi-HetNets, the capacity ...
Publication date: 2011